Accent Sandhi Estimation of Tokyo Dialect of Japanese Using Conditional Random Fields

Authors

  • Masayuki Suzuki
  • Ryo Kuroiwa
  • Keisuke Innami
  • Shumpei Kobayashi
  • Shinya Shimizu
  • Nobuaki Minematsu
  • Keikichi Hirose
Abstract

When synthesizing speech from Japanese text, correctly assigning accent nuclei to input text with arbitrary content is indispensable for obtaining natural-sounding synthetic speech. A phenomenon called accent sandhi occurs in Japanese utterances: when a word is uttered in a sentence, its accent nucleus may change depending on the preceding and succeeding words. This paper describes a statistical method for automatically predicting the accent nucleus changes caused by accent sandhi. First, as the basis of the research, a database of Japanese text was constructed with labels for accent phrase boundaries and for accent nucleus positions as uttered in sentences. A single native speaker of Tokyo dialect Japanese annotated all the labels for 6,344 Japanese sentences. Then, using this database, a conditional-random-field-based method was developed to predict accent phrase boundaries and accent nuclei. The proposed method predicted accent nucleus positions for accent phrases with 94.66% accuracy, clearly surpassing the 87.48% accuracy obtained using our rule-based method. A listening experiment was also conducted on synthetic speech obtained using the proposed method and that obtained using the rule-based method. The results show that our method significantly improved the naturalness of synthetic speech.

Key words: Japanese text-to-speech, accent sandhi, accent phrase boundary estimation, accent type estimation, conditional random field
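To illustrate how a linear-chain conditional random field can assign accent phrase boundaries, here is a minimal sketch of the decoding step only. It is not the authors' implementation. The part-of-speech tags used as the sole feature, the B/I label scheme, and every weight are hypothetical values chosen for illustration. A real system would learn the weights from an annotated database such as the one described above and would use much richer features.

```python
from typing import Dict, List, Tuple

# Hypothetical label scheme: "B" = word begins a new accent phrase,
# "I" = word continues the current accent phrase.
LABELS = ["B", "I"]

# Hypothetical hand-set weights (a trained CRF would learn these).
# EMISSION scores a (part-of-speech, label) pair.
EMISSION: Dict[Tuple[str, str], float] = {
    ("noun", "B"): 1.0, ("noun", "I"): 0.2,
    ("particle", "B"): -1.0, ("particle", "I"): 1.5,
    ("verb", "B"): 0.8, ("verb", "I"): 0.3,
}
# TRANSITION scores a (previous label, current label) pair.
TRANSITION: Dict[Tuple[str, str], float] = {
    ("B", "B"): -0.5, ("B", "I"): 0.5,
    ("I", "B"): 0.4, ("I", "I"): 0.0,
}


def viterbi(pos_tags: List[str]) -> List[str]:
    """Return the highest-scoring label sequence for a list of POS tags."""
    if not pos_tags:
        return []
    # score[y]: best score of any label path ending in label y so far.
    score = {y: EMISSION.get((pos_tags[0], y), 0.0) for y in LABELS}
    backpointers: List[Dict[str, str]] = []
    for tag in pos_tags[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in LABELS:
            # Pick the previous label that maximizes the path score into y.
            best_prev = max(LABELS, key=lambda p: score[p] + TRANSITION[(p, y)])
            new_score[y] = (score[best_prev] + TRANSITION[(best_prev, y)]
                            + EMISSION.get((tag, y), 0.0))
            pointers[y] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label to recover the full path.
    path = [max(LABELS, key=lambda y: score[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    print(viterbi(["noun", "particle", "verb"]))
```

With these toy weights, a noun followed by a particle is grouped into one accent phrase and the following verb starts a new one. The same dynamic program would apply to accent nucleus labels if the label set were changed accordingly.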


Similar resources

CRF-based statistical learning of Japanese accent sandhi for developing Japanese text-to-speech synthesis systems

In Japanese, every content word has its own H/L pitch pattern when it is uttered in isolation, called its accent type. In a TTS system, this lexical information is usually stored in a dictionary and referred to for prosody generation. When converting a written sentence to speech, however, this lexical H/L pattern is often changed according to the context, a phenomenon known as word accent sandhi. This accen...


Improved Prediction of Japanese Word Accent Sandhi Using CRF

In Japanese, every content word has its own mora-based H/L pitch pattern when it is uttered in isolation, called its accent type. When reading out a written sentence, however, this lexical H/L pattern is often changed according to the context, a phenomenon known as word accent sandhi. In our previous work, an accent sandhi predictor was developed using CRF [1], and in this paper, the predictor is improved throu...


Improvement of CRF-Based Accent Sandhi Prediction Using The Features Derived from Accent Rules

When developing Japanese text-to-speech (TTS) systems, algorithms that accurately predict the accent type of each constituent phrase are essential for better output speech quality. In our previous studies on accent type estimation, a CRF-based method was realized. Although this method outperformed the conventional rule-based method, the estimation accuracy of particular phrases such as those incl...


Manifestation of downstep and intonation in Japanese: comparison of the Tokyo and Kochi dialects

This paper examines the manifestation of downstep and intonation in the Tokyo and Kochi dialects of Japanese by using three types of syntactically balanced material: adjective phrases, adverbial phrases, and sentence modifiers. The main conclusion is that Kochi speakers produce a smaller Major Phrase consisting of fewer lexical accents than in the Tokyo dialect, the Major Phrase being defined as...


The role of Japanese pitch accent in spoken-word recognition: evidence from middle-aged accentless dialect listeners

This paper investigates the role of pitch accent information in spoken-word recognition in listeners in Fukushima, Japan, whose dialect is accentless. Previous research revealed that accentless listeners were less sensitive to pitch accent than Tokyo Japanese listeners. The present study asked whether middle-aged listeners’ use of accent information would differ from that of young listeners. 40...



Journal:
  • IEICE Transactions

Volume 100-D


Publication date: 2017